-
Notifications
You must be signed in to change notification settings - Fork 916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework cudf::find_and_replace_all to use gather-based make_strings_column #15305
Conversation
rmm::cuda_stream_view stream, | ||
rmm::mr::device_memory_resource* mr) | ||
{ | ||
if (input.is_empty()) { return cudf::make_empty_column(type_id::STRING); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we check if (input.is_empty() or values_to_replace.empty() or replacement_values.empty())
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already handled by the caller:
https://github.com/rapidsai/cudf/pull/15305/files#diff-8c11e29e23c203bd58fd1ee8b6aca9d9b49e06edbaced8e68bbf4728f9a1ebc5R310-R312
So this check is probably not necessary. I can remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution. This def makes things neater and clearer. Looks good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CMake approval.
/merge |
Description
Reworks
cudf::find_and_replace_all
for strings to work with long strings and enable it to support large strings.The custom kernels were replaced with a gather-based
make_strings_column
already optimized for long and short strings.Large strings will automatically be supported in
make_strings_column
in a future PR.Checklist